All credit to https://github.com/fchollet/keras/blob/master/examples/mnist_mlp.py; the following code is a modified version of that example. The error rate is about 0.9% after 10 epochs of training.
In [1]:
import numpy as np
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils
In [2]:
# Fix random seed for reproducibility
seed = 7
np.random.seed(seed)
# Load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()
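As a quick sanity check (not part of the original notebook), one can inspect what mnist.load_data() returns: arrays of 28x28 grayscale images with integer labels.
# Sketch: inspect the raw MNIST arrays
print(X_train.shape, X_train.dtype)  # (60000, 28, 28) uint8
print(y_train.shape, y_train.dtype)  # (60000,) uint8
print(y_train[:5])                   # e.g. [5 0 4 1 9]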
In [3]:
# Flatten 28*28 images to a 784 vector for each image
num_pixels = X_train.shape[1] * X_train.shape[2]
X_train = X_train.reshape(X_train.shape[0], num_pixels).astype('float32')
X_test = X_test.reshape(X_test.shape[0], num_pixels).astype('float32')
num_pixels is equal to 784 (28 * 28).
X_train now has the shape (60000, 784).
X_test now has the shape (10000, 784).
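A short check that the flattening did what the text claims (a sketch, assuming the cells above were run):
# Verify the flattened shapes
assert num_pixels == 784
assert X_train.shape == (60000, 784)
assert X_test.shape == (10000, 784)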
In [4]:
# Normalize inputs from 0-255 to 0-1
X_train = X_train / 255
X_test = X_test / 255
In [5]:
# One-hot-encode outputs (e.g. 2 --> [0,0,1,0,0,0,0,0,0,0])
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]
One-hot encoding is used because the network has one output neuron per digit class.
To decode the network's output, one takes the index of the most active output neuron (argmax) and thereby converts the one-hot vector back into a digit.
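A short sketch of that round trip; to_categorical and np.argmax are the actual Keras/NumPy calls, the sample labels here are made up:
# Round trip: digit -> one-hot vector -> digit
labels = np.array([2, 7])
one_hot = np_utils.to_categorical(labels, num_classes=10)
print(one_hot[0])                  # [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
print(np.argmax(one_hot, axis=1))  # [2 7]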
In [6]:
def baseline_model():
    # Create model
    model = Sequential()
    model.add(Dense(num_pixels, input_dim=num_pixels, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
'softmax' normalizes the output layer's activations into a probability distribution over the 10 classes: each output lies between 0 and 1 and the outputs sum to 1.
'categorical_crossentropy' is the loss (error) function minimized during training; it compares the predicted distribution with the one-hot target.
'adam' is the optimizer, an adaptive variant of stochastic gradient descent.
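To make the softmax claim concrete, a minimal NumPy sketch (not part of the original notebook):
# Softmax: exponentiate and normalize, yielding a probability distribution
def softmax(z):
    e = np.exp(z - np.max(z))  # shift by the max for numerical stability
    return e / e.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))        # ~[0.659 0.242 0.099]
print(softmax(np.array([2.0, 1.0, 0.1])).sum())  # 1.0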
In [8]:
# Build the model
model = baseline_model()
# Fit the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, batch_size=200, verbose=2)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Error: %.2f%%" % (100-scores[1]*100))